Information Extraction for Ontology Learning

Author

  • Fabian SUCHANEK
Abstract

In this chapter, we discuss how ontologies can be constructed by extracting information from Web documents. This is a challenging task, because information extraction is usually noisy, whereas ontologies usually require clean and crisp data. The extracted information therefore has to be cleaned, disambiguated, and made logically consistent to some degree. We will discuss three approaches that extract an ontology in this spirit from Wikipedia (DBpedia, YAGO, and KOG). We will also present approaches that aim to extract an ontology from natural language documents or, by extension, from the entire Web (OntoUSP, NELL, and SOFIE). We will show that information extraction and ontology construction can enter into a fruitful reinforcement loop, where more extracted information leads to a larger ontology, and a larger ontology helps extract more information.
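
The reinforcement loop described in the abstract can be illustrated with a toy sketch. This is not the method of DBpedia, YAGO, KOG, OntoUSP, NELL, or SOFIE; it is a minimal pattern-bootstrapping example under strong assumptions: facts are plain (subject, object) pairs for a single relation, entities are single words, and the function names `learn_patterns`, `apply_patterns`, and `bootstrap` are hypothetical. Known facts are used to learn textual patterns, and the patterns are then used to extract further facts. The cleaning, disambiguation, and consistency checking that the abstract calls for are deliberately left out.

```python
import re


def learn_patterns(sentences, facts):
    """Collect the text that appears between a known subject and object."""
    patterns = set()
    for subj, obj in facts:
        for sentence in sentences:
            m = re.search(re.escape(subj) + r"(.+?)" + re.escape(obj), sentence)
            if m:
                patterns.add(m.group(1))
    return patterns


def apply_patterns(sentences, patterns):
    """Extract (word, word) pairs that are joined by any learned pattern."""
    facts = set()
    for pattern in patterns:
        rx = re.compile(r"(\w+)" + re.escape(pattern) + r"(\w+)")
        for sentence in sentences:
            for m in rx.finditer(sentence):
                facts.add((m.group(1), m.group(2)))
    return facts


def bootstrap(sentences, seed_facts, max_rounds=10):
    """Alternate pattern learning and extraction until no new facts appear."""
    facts = set(seed_facts)
    for _ in range(max_rounds):
        patterns = learn_patterns(sentences, facts)
        new_facts = apply_patterns(sentences, patterns) - facts
        if not new_facts:
            break
        facts |= new_facts
    return facts


# Illustrative input (made up for this sketch).
sentences = [
    "Einstein was born in Ulm",
    "Curie was born in Warsaw",
    "Curie grew up in Warsaw",
    "Turing grew up in London",
]
print(sorted(bootstrap(sentences, {("Einstein", "Ulm")})))
```

Starting from one seed fact, the first round learns the pattern " was born in " and finds the Curie fact; that fact then yields the pattern " grew up in ", which finds the Turing fact. The same example also shows why such loops are noisy: "grew up in" is not the same relation as "was born in", so a real system would need to filter or validate what it learns.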

Similar resources

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

NLP Techniques for Term Extraction and Ontology Population

This chapter investigates NLP techniques for ontology population, using a combination of rule-based approaches and machine learning. We describe a method for term recognition using linguistic and statistical techniques, making use of contextual information to bootstrap learning. We then investigate how term recognition techniques can be useful for the wider task of information extraction, makin...

Information Extraction and Ontology Learning Guided by Web Directory

The paper presents our ongoing effort to create an information extraction tool for collecting general information on products and services from the free text of commercial web pages. A promising approach is that of combining information extraction with ontologies. Ontologies can improve the quality of information extraction and, on the other hand, the extracted information can be used to improv...

Bootstrapping an Ontology-based Information Extraction System

Automatic intelligent web exploration will benefit from shallow information extraction techniques if the latter can be brought to work within many different domains. The major bottleneck for this, however, lies in the so far difficult and expensive modeling of lexical knowledge, extraction rules, and an ontology that together define the information extraction system. In this paper we present a ...

On the Need to Bootstrap Ontology Learning with Extraction Grammar Learning

The main claim of this paper is that machine learning can help integrate the construction of ontologies and extraction grammars and lead us closer to the Semantic Web vision. The proposed approach is a bootstrapping process that combines ontology and grammar learning, in order to semi-automate the knowledge acquisition process. After providing a survey of the most relevant work towards this goa...

Integrating Information Extraction, Ontology Learning and Seman

Ontology-based approaches to Knowledge Management promise better access to relevant knowledge by providing a domain-specific vocabulary that is used for describing the contents of knowledge as well as for retrieving that knowledge. Despite the potential benefits, ontology-based approaches require a considerable amount of commitment and expertise in tasks like creating and maintaining ontologies...

Publication date: 2013